Generating High-Performance General Size Linear Transform Libraries Using Spiral

Authors

  • Yevgen Voronenko
  • Franz Franchetti
  • Frédéric de Mesmay
  • Markus Püschel
Abstract

Developing numerical libraries that achieve the highest performance on modern computer architectures has become extremely difficult because of increasingly complicated microarchitectures, deep cache hierarchies, and different forms of on-chip parallelism, such as multiple processor cores and SIMD short-vector instruction sets. This difficulty has led to interest in automated tools that simplify the development of high-performance libraries without sacrificing performance. The program generator Spiral [2] is an example of such a tool for the domain of linear transforms, such as the discrete Fourier transform (DFT), FIR filters, and others. Spiral automatically generates an optimized, platform-adapted implementation given only the transform's specification (e.g., DFT1024) and a high-level description of the recursive divide-and-conquer algorithm in a domain-specific language called SPL (Signal Processing Language). Spiral performs optimizations such as vectorization and parallelization by rewriting at the high level of abstraction provided by SPL, as well as on lower-level representations. To exploit the potential offered by development automation tools, the Intel Integrated Performance Primitives (IPP) library, which provides a large number of optimized linear transform functions, will, starting with version 6.0, include a special domain for functions automatically generated by Spiral. Until now, Spiral was restricted to generating code for transforms of a fixed size known at generation time. In this paper we give an overview of our latest research results [3], which enable generating full general-size libraries, for which the transform size is known only at runtime.
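
To make the distinction between fixed-size code and a general-size library concrete, the following is a minimal hand-written sketch of a recursive divide-and-conquer DFT whose size is only known at runtime. It is not Spiral output and does not use SPL; the function names (`naive_dft`, `smallest_factor`, `dft`) are hypothetical, and the choice to split on the smallest prime factor is an assumption made for brevity. A fixed-size generator could fully unroll this recursion for one known size (e.g. 1024), whereas a general-size library has to keep the recursion and the size-dependent decisions in the code.

```python
import cmath


def naive_dft(x):
    """Direct O(n^2) DFT, used as the base case for size 1 and prime sizes."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def smallest_factor(n):
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n


def dft(x):
    """General-size recursive Cooley-Tukey DFT; len(x) is decided at runtime."""
    n = len(x)
    if n <= 1:
        return naive_dft(x)
    m = smallest_factor(n)
    if m == n:
        # Prime size: no factorization available, fall back to the direct DFT.
        return naive_dft(x)
    k = n // m
    # Divide: m interleaved subsequences of length k, each transformed recursively.
    subs = [dft(x[r::m]) for r in range(m)]
    # Conquer: combine the sub-results with twiddle factors exp(-2*pi*i*r*q/n).
    return [
        sum(subs[r][q % k] * cmath.exp(-2j * cmath.pi * r * q / n) for r in range(m))
        for q in range(n)
    ]
```

Calling `dft(signal)` works for any input length, which is the property the abstract refers to as "general size"; the price is that factorization and recursion happen on every call instead of being resolved at generation time.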


Related resources

System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries

We demonstrate Spiral, a domain-specific library generation system. Spiral generates high-performance source code for linear transforms (such as the discrete Fourier transform and many others) directly from a problem specification. The key idea underlying Spiral is to perform automatic reasoning and optimizations at a high abstraction level using the mathematical, declarative domain-specific lan...

Spiral

Spiral is a program generation system (software that generates other software) for linear transforms and an increasing list of other mathematical functions. The goal of Spiral is to automate the development and porting of performance libraries. Linear transforms include the discrete Fourier transform (DFT), discrete cosine transforms, convolution, and the discrete wavelet transform. The input t...

High Performance Linear Transform Program Generation for the Cell BE

The Cell BE is among a new generation of multicore processors including the Intel Larrabee and the Tilera TILE64 that provide an impressive peak fixed or floating point performance for scientific, signal processing, visualization, and other engineering applications. As shown in Fig. 1, the Cell uses simple in-order cores designed specifically for numerical computing, and requires explicit memor...

FFT Program Generation for the Cell BE

The complexity of the Cell BE’s architecture makes it difficult and time consuming to develop multithreaded, vectorized, high-performance numerical libraries. Our approach to solving this problem is to use Spiral, a program generation system, to automatically generate and optimize linear transform libraries for the Cell. To extend the Spiral framework to support the Cell architecture, we first ...

Generation of Custom DSP Transform IP Cores: Case Study Walsh-Hadamard Transform

Hardware designers are increasingly relying on pre-designed DSP (digital signal processing) cores from IP libraries to improve their productivity and reduce design time. Unfortunately, static DSP cores cannot accommodate application-specific trade-offs. To overcome this problem, we are proposing to automatically generate customized DSP cores that can be tailored for specific design requirements...

Publication date: 2008